ABSTRACT

Normal  probability  plotting  of  the  orthogonal  contrasts  (Daniel  (1959)) 
is  a  useful  diagnostic  as  well  as  inferential  technique  for  analyzing 
unreplicated  factorial  experiments.  In  particular,  the  presence  of  one  or 
more  faulty  observations  is  marked  by  a  characteristic  defect  of  the  normal 
plot.  In  previous  work  (Box  and  Meyer  (1985)),  the  authors  introduced  a  more 
formal  method  of  identifying  active  effects  as  a  supplement  to  Daniel's 
graphical  analysis.  This  work  is  extended  here  to  allow  for  the  possibility 
of faulty observations.
SIGNIFICANCE AND EXPLANATION


It  is  well  known  that  analysis  of  unreplicated  factorial  experiments  is 
quite  sensitive  to  faulty  values  among  the  observations.  These  may  be  due  to 
an  oversight  in  experimental  procedure,  a  gross  measurement  error,  or  a 
mistake  in  recording  the  observation.  A  graphical  technique  due  to  Daniel 
(1959)  has  been  used  successfully  to  detect  such  faulty  observations.  When 
the  presence  of  such  values  is  suspected,  the  model  previously  introduced  by 
Box  and  Meyer  (1985)  can  be  extended  to  accommodate  the  bad  values. 
ANALYSIS OF UNREPLICATED FACTORIALS ALLOWING FOR
POSSIBLY  FAULTY  OBSERVATIONS 

George  E.  P.  Box  and  R.  Daniel  Meyer 

1.  Introduction 

Normal  probability  plotting  of  orthogonal  contrasts  (Daniel  (1959))  has 
become  a  standard  technique  for  interpreting  and  criticizing  unreplicated 
factorial  and  fractional  factorial  experiments.  While  the  primary  objective 
of  the  normal  plot  is  to  determine  which  effects  are  distinguishable  from 
noise,  Daniel  has  pointed  out  that  this  is  not  its  only  function.  Various 
departures  from  assumptions  may  be  detected  by  critical  inspection  of  the 
normal  plot.  In  particular,  the  presence  of  one  or  more  faulty  observations 
is  marked  by  a  characteristic  pattern  among  the  plotted  points. 

1.1  An  Example 

To illustrate, consider the data in Table 1 for a full $2^4$ factorial
experiment  taken  from  Box  and  Draper  (1985).  With  the  factors  denoted  by 
1,  2,  3  and  4  this  shows  the  design  array,  the  original  observations  and  the 
estimated  effects.  A  normal  plot  of  the  effects  is  shown  in  Figure  1.  From 
this  plot  it  will  be  seen  that  while  the  main  effects  2  and  3  are  largest  in 
absolute  magnitude  they  do  not  deviate  very  much  from  a  line  drawn  through  all 
the  remaining  points.  One  would  hesitate  therefore  to  conclude  on  this  basis 
that  they  were  distinguishable  from  the  noise.  There  is,  however,  another 
feature  of  the  plot  which  bears  further  consideration.  The  points  falling 
near  zero  appear  to  follow  two  different  parallel  lines  rather  than  one,  with 
negative  values  on  one  line  and  positive  values  on  the  other.  Daniel  points 
out  that  such  behavior  suggests  the  possibility  of  a  faulty  observation.  He 
also  suggests  (Daniel  (1976))  how  that  observation  may  be  identified  using  the 
fact that if a particular observation is biased by, say, a positive amount,
those  contrasts  in  which  the  observation  enters  positively  are  shifted  to  the 
right,  and  those  contrasts  in  which  the  observation  enters  negatively  are 
shifted  to  the  left.  This  produces  a  "gap"  such  as  the  one  seen  in  Figure  1. 
Thus  the  observation,  if  it  exists,  which  enters  positively  in  the  small 
positive  contrasts  and  negatively  in  the  small  negative  contrasts  will  be 
under suspicion.

Examination  of  the  design  array,  Table  1,  shows  that  the  row  of  signs 
corresponding to $y_{13}$ = 59.15 matches the signs of all contrasts save one,
suggesting  that  this  thirteenth  observation  is  too  large.  In  practice,  such  a 
discovery  should  lead  to  reconsideration  of  the  data  and  in  particular  of  what 
effects  might  show  up  if  this  observation  were  appropriately  adjusted.  In  an 
ongoing  investigation  it  should  lead  the  experimenter  to  consider  any  special 
circumstances  which  might  have  surrounded  the  making  of  this  observation  and 
possibly  to  a  repetition  of  the  observation  or  of  some  selected  part  of  the 
design  involving  this  observation. 
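
Daniel's sign-matching argument can be sketched in a few lines of Python. The sketch is illustrative only: it assumes the design rows are in standard (Yates) order, it uses made-up responses rather than the data of Table 1, and the helper names (`design_rows`, `contrast_columns`, `effects`, `sign_matches`) are hypothetical. It shows that a bias `b` added to one observation shifts every estimated effect by `b/(n/2)` in the direction of that observation's sign in the design array, so the biased run is the one whose row of signs agrees with nearly all of the contrast signs.

```python
from itertools import combinations, product

def design_rows(k):
    """Rows of a full 2^k design in standard order (factor 1 varies fastest)."""
    return [tuple(reversed(levels)) for levels in product((-1, 1), repeat=k)]

def contrast_columns(rows):
    """Map each effect label ('1', '13', '134', ...) to its column of +/-1 signs."""
    k = len(rows[0])
    cols = {}
    for size in range(1, k + 1):
        for combo in combinations(range(k), size):
            label = "".join(str(f + 1) for f in combo)
            col = []
            for row in rows:
                sign = 1
                for f in combo:
                    sign *= row[f]
                col.append(sign)
            cols[label] = col
    return cols

def effects(cols, y):
    """Estimated effects: each contrast divided by n/2."""
    half = len(y) / 2
    return {lab: sum(s * v for s, v in zip(col, y)) / half
            for lab, col in cols.items()}

def sign_matches(cols, est):
    """For each run, count the effects whose sign equals the run's design sign."""
    n = len(next(iter(cols.values())))
    return [sum(1 for lab, col in cols.items() if col[j] * est[lab] > 0)
            for j in range(n)]

rows = design_rows(4)
cols = contrast_columns(rows)

# Hypothetical situation: no real effects, no noise, and a bias of +8 on run 13.
bias, bad_run = 8.0, 12          # index 12 is the thirteenth observation
y = [50.0] * 16
y[bad_run] += bias

est = effects(cols, y)
matches = sign_matches(cols, est)
print(matches.index(max(matches)) + 1)   # -> 13
```

In this noise-free toy case the biased run matches all 15 contrast signs, while every other run matches exactly 7, because the rows of the full sign matrix are mutually orthogonal. With real data the match is blurred by noise and by genuinely active effects, which is why the text restricts attention to the small contrasts.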

2.  A  More  Formal  Solution 

The  remainder  of  this  paper  summarizes  some  recent  work  on  a  more  formal 
study  which  however  closely  follows  the  spirit  of  Daniel's  analysis.  We 
emphasize  that  we  regard  this  work  not  as  replacing  this  analysis  but  as 
perhaps  a  useful  adjunct  to  it. 

The possibility of model inadequacies poses a dilemma for the
experimenter rather like that faced by a small country which believes itself
in danger of air attack and wonders how it should spend a limited budget on
radar apparatus. While some resources should be spent on highly directional
radars to monitor with great sensitivity the direction (or directions)
regarded as most likely, it might be wise to spend the rest on nondirectional
instruments which, while less sensitive, could monitor the whole horizon.
Graphical analysis can perform a task like global radar, making it possible
for the investigator to be alerted to contingencies not initially bargained for
(see also Box (1980)).


2.1  A  model  based  on  the  effect  sparsity  hypothesis 

Clearly implied by Daniel's normal plot analysis is a hypothesis of
effect sparsity: that most of what is occurring can be accounted for by a few
active effects. Suppose $X$ is the $n \times n$ design matrix from which the
$n - 1$ usual estimated effects are calculated, and $y$ is the $n \times 1$
vector of observations. If $X_{(a)}$ denotes the columns of $X$ which
correspond to active effects $\tau_{(a)}$, then $y$ may be described by the
relationship

$$y = X_{(a)} \tau_{(a)} + e$$

where $e$ is the $n \times 1$ vector of normally distributed errors with zero
mean and variance $\sigma^2$. Let $\alpha_1$ be the prior probability that an
effect is active, and let $a_{(r_1)}$ be the event that a particular set of
$r_1$ of the $n - 1$ effects is active; $X_{(r_1)}$ and $\tau_{(r_1)}$ are the
columns of $X$ and the effects corresponding to $a_{(r_1)}$. The prior
distribution of each active effect $\tau_i$, given $\sigma^2$, is an
independent normal with mean zero and variance $\gamma^2 \sigma^2$; the prior
distributions of the mean $\tau_0$ and $\log(\sigma)$ are locally uniform (see,
e.g., Box and Tiao (1968), (1973)). Thus the posterior probability of the
event $a_{(r_1)}$ can be written

$$p(a_{(r_1)} \mid y) \propto \left(\frac{\alpha_1}{1-\alpha_1}\right)^{r_1} \gamma^{-r_1} \frac{|X_{(0)}'X_{(0)}|^{1/2}}{|\Gamma_{r_1} + X_{(r_1)}'X_{(r_1)}|^{1/2}} \left[\frac{S(\hat\tau_{(r_1)}) + \hat\tau_{(r_1)}'\Gamma_{r_1}\hat\tau_{(r_1)}}{S(\hat\tau_{(0)})}\right]^{-(n-1)/2}$$

where

$$\Gamma_{r_1} = \frac{1}{\gamma^2} \begin{pmatrix} 0 & 0' \\ 0 & I_{r_1} \end{pmatrix}, \qquad I_{r_1} = r_1 \times r_1 \text{ identity matrix},$$

$$\hat\tau_{(r_1)} = (\Gamma_{r_1} + X_{(r_1)}'X_{(r_1)})^{-1} X_{(r_1)}'y,$$

$$S(\hat\tau_{(r_1)}) = (y - X_{(r_1)}\hat\tau_{(r_1)})'(y - X_{(r_1)}\hat\tau_{(r_1)}).$$

The leading zero in $\Gamma_{r_1}$ corresponds to the mean $\tau_0$, whose prior
is locally uniform.

Then, for example, the posterior probability that an effect $i$ is active is

$$p_i = P[\text{effect } i \text{ active} \mid y] = \sum_{(r_1):\, i \text{ active}} p(a_{(r_1)} \mid y).$$

In an earlier paper Box and Meyer (1985) follow such an approach to provide an
alternative means of locating active effects, and they show that the
statistical literature suggests average values for these parameters of
$\alpha_1 = 0.2$, $\gamma = 2.5$. They furthermore show that the conclusions
about which effects are active are usually insensitive to variations over the
ranges of values of $(\alpha_1, \gamma)$ which appear to be actually
encountered.
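
For an orthogonal two-level design the posterior probabilities of this section can be enumerated directly. The following Python sketch is an illustration under stated assumptions, not the authors' program. It assumes the columns of $X$ are coded ±1 and mutually orthogonal, so that $X'X = nI$ and the matrix expressions reduce to sums of squared contrasts. It uses made-up data for a $2^3$ design, and the function names `contrast_columns` and `posterior_active` are hypothetical. Under those assumptions $S(\hat\tau) + \hat\tau'\Gamma\hat\tau = y'y - (\sum y)^2/n - \sum_i c_i^2/(n + 1/\gamma^2)$, where $c_i = x_i'y$ is the contrast of active effect $i$, and $\gamma^{-r_1}$ times the determinant ratio equals $(\gamma^2 n + 1)^{-r_1/2}$.

```python
import math
from itertools import combinations, product

def contrast_columns(k):
    """Sign columns (+/-1) of all 2^k - 1 effects for a full 2^k design."""
    rows = [tuple(reversed(r)) for r in product((-1, 1), repeat=k)]
    cols = {}
    for size in range(1, k + 1):
        for combo in combinations(range(k), size):
            label = "".join(str(f + 1) for f in combo)
            cols[label] = [math.prod(row[f] for f in combo) for row in rows]
    return cols

def posterior_active(cols, y, alpha1=0.2, gamma=2.5):
    """Posterior probability that each effect is active, summing over all subsets."""
    n = len(y)
    labels = list(cols)
    m = len(labels)
    # Squared contrasts c_i^2 = (x_i' y)^2 for every candidate effect.
    c2 = [sum(s * v for s, v in zip(cols[lab], y)) ** 2 for lab in labels]
    s0 = sum(v * v for v in y) - sum(y) ** 2 / n      # S(tau_hat_(0)): mean only
    shrink = n + 1.0 / gamma ** 2                      # diagonal of Gamma + X'X
    # Log of the per-effect factor (alpha1/(1-alpha1)) * (gamma^2 n + 1)^(-1/2).
    log_step = math.log(alpha1 / (1 - alpha1)) - 0.5 * math.log(gamma ** 2 * n + 1)
    log_w = []
    for mask in range(1 << m):                         # each mask is one event a_(r1)
        r1 = bin(mask).count("1")
        fitted = sum(c2[i] for i in range(m) if mask >> i & 1)
        q = s0 - fitted / shrink                       # S(tau_hat) + tau_hat' Gamma tau_hat
        log_w.append(r1 * log_step - 0.5 * (n - 1) * math.log(q / s0))
    top = max(log_w)
    w = [math.exp(v - top) for v in log_w]             # rescale to avoid overflow
    total = sum(w)
    return {labels[i]: sum(w[mask] for mask in range(1 << m) if mask >> i & 1) / total
            for i in range(m)}

# Made-up 2^3 data: a large main effect of factor 1 plus small disturbances.
cols = contrast_columns(3)
noise = [0.3, -0.2, 0.1, -0.4, 0.2, -0.1, 0.4, -0.3]
y = [10 + 5 * x1 + e for x1, e in zip(cols["1"], noise)]
p = posterior_active(cols, y)
for label, prob in p.items():
    print(label, round(prob, 3))
```

For a 16-run design the same loop visits $2^{15}$ subsets, which is still quick. The enumeration grows exponentially with the number of candidate effects, so this direct approach suits only small designs.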


2.2  Faulty  observations 

To allow for the possibility of faulty observations (Meyer and Box
(1985)), we suppose that the errors associated with such values have an
inflated error variance $k^2\sigma^2$ ($k > 1$) and occur with some small
probability $\alpha_2$. Thus the error $e$ is supposed to follow the
scale-contaminated normal distribution
$(1-\alpha_2) N(0, \sigma^2) + \alpha_2 N(0, k^2\sigma^2)$. Let
$a_{(r_1,r_2)}$ be the event that a particular set of $r_1$ effects are active
and a particular set of $r_2$ observations are faulty; $X_{(r_1,r_2)}$ is the
matrix of columns and rows of $X$ corresponding to active effects and faulty
observations, and $y_{(r_2)}$ the elements of $y$ supposed to be faulty. Then
the posterior probability of the event $a_{(r_1,r_2)}$ can be written

$$p(a_{(r_1,r_2)} \mid y) \propto \left(\frac{\alpha_1}{1-\alpha_1}\right)^{r_1} \left(\frac{\alpha_2}{1-\alpha_2}\right)^{r_2} \gamma^{-r_1} k^{-r_2} \frac{|X_{(0)}'X_{(0)}|^{1/2}}{|\Gamma_{r_1} + X_{(r_1)}'X_{(r_1)} - \phi X_{(r_1,r_2)}'X_{(r_1,r_2)}|^{1/2}} \left[\frac{S(\hat\tau_{(r_1,r_2)}) + \hat\tau_{(r_1,r_2)}'\Gamma_{r_1}\hat\tau_{(r_1,r_2)}}{S(\hat\tau_{(0)})}\right]^{-(n-1)/2}$$

where

$$\phi = 1 - \frac{1}{k^2},$$

$$\hat\tau_{(r_1,r_2)} = (\Gamma_{r_1} + X_{(r_1)}'X_{(r_1)} - \phi X_{(r_1,r_2)}'X_{(r_1,r_2)})^{-1} (X_{(r_1)}'y - \phi X_{(r_1,r_2)}'y_{(r_2)}),$$

$$S(\hat\tau_{(r_1,r_2)}) = (y - X_{(r_1)}\hat\tau_{(r_1,r_2)})'(y - X_{(r_1)}\hat\tau_{(r_1,r_2)}) - \phi (y_{(r_2)} - X_{(r_1,r_2)}\hat\tau_{(r_1,r_2)})'(y_{(r_2)} - X_{(r_1,r_2)}\hat\tau_{(r_1,r_2)}).$$

Then, for example, the posterior probability that effect $i$ is active is

$$p_i = \sum_{(r_1,r_2):\, i \text{ active}} p(a_{(r_1,r_2)} \mid y)$$

and the posterior probability that observation $y_j$ is faulty is

$$q_j = \sum_{(r_1,r_2):\, y_j \text{ faulty}} p(a_{(r_1,r_2)} \mid y).$$
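
The way the inflated variance enters these expressions can be seen from a short derivation. The block below is a reading aid rather than part of the original text; the weight matrix $W$, the row vector $x_j'$ (row $j$ of $X_{(r_1)}$) and the symbol $\phi = 1 - 1/k^2$ are notation assumed here. Giving each faulty observation weight $1/k^2$ in a weighted sum of squares is the same as subtracting a fraction $\phi$ of the faulty rows' contribution from the unweighted quantities.

```latex
% W = diag(w_1, ..., w_n), with w_j = 1/k^2 for the r_2 faulty observations
% and w_j = 1 for the remaining ("good") observations.
\begin{align*}
(y - X_{(r_1)}\tau)' W (y - X_{(r_1)}\tau)
  &= \sum_{j\ \text{good}} (y_j - x_j'\tau)^2
     + \frac{1}{k^2} \sum_{j\ \text{faulty}} (y_j - x_j'\tau)^2 \\
  &= (y - X_{(r_1)}\tau)'(y - X_{(r_1)}\tau)
     - \phi\,(y_{(r_2)} - X_{(r_1,r_2)}\tau)'(y_{(r_2)} - X_{(r_1,r_2)}\tau),
     \qquad \phi = 1 - \frac{1}{k^2}, \\
X_{(r_1)}' W X_{(r_1)}
  &= X_{(r_1)}'X_{(r_1)} - \phi\, X_{(r_1,r_2)}'X_{(r_1,r_2)}, \\
X_{(r_1)}' W y
  &= X_{(r_1)}'y - \phi\, X_{(r_1,r_2)}'y_{(r_2)} .
\end{align*}
```

With $k = 5$ this gives $\phi = 0.96$, so a faulty observation retains only 4% of its usual weight; as $k \to \infty$, $\phi \to 1$ and the faulty rows are dropped entirely.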


Computing the $\{p_i\}$ and $\{q_j\}$ over all combinations $a_{(r_1,r_2)}$
will generally not be feasible. Instead we employ the following iterative
approximation. We first compute the probabilities $\{p_i\}$ assuming there are
no faulty observations. Then temporarily choose the active effects as those
with $p_i > P$. The probabilities $\{q_j\}$ are computed with the active
effects held fixed by the above choice. The probabilities $\{p_i\}$ are then
recomputed, assuming all observations with $q_j > Q$ to have variance
$k^2\sigma^2$, and so on. In most cases convergence is achieved in one or two
iterations, with $P = Q = 0.5$.
Alternatively, P and Q may be chosen after observing the results of the
first  iteration  as  a  more  exploratory  approach.  As  computing  power  increases, 
the  simultaneous  summation  over  all  combinations  of  active  columns  and  faulty 
observations  will  be  the  most  desirable  method  of  computation. 
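
The control flow of the iterative approximation can be sketched as follows. This is a minimal outline under stated assumptions, not the authors' program: the callables `compute_p` and `compute_q` are hypothetical placeholders for routines that evaluate the posterior sums, and the toy stand-ins `toy_p` and `toy_q` return invented numbers purely to exercise the loop.

```python
def iterate_active_faulty(compute_p, compute_q, P=0.5, Q=0.5, max_iter=10):
    """Alternate between {p_i} and {q_j} until the chosen sets stop changing.

    compute_p(faulty) -> {effect: p_i}, treating the observations in `faulty`
                         as having variance k^2 sigma^2.
    compute_q(active) -> {observation: q_j}, holding the effects in `active` fixed.
    Both callables are placeholders supplied by the user.
    """
    faulty = frozenset()                       # start: no faulty observations
    active = None
    for iteration in range(1, max_iter + 1):
        p = compute_p(faulty)
        new_active = frozenset(i for i, v in p.items() if v > P)
        q = compute_q(new_active)
        new_faulty = frozenset(j for j, v in q.items() if v > Q)
        if new_active == active and new_faulty == faulty:
            return p, q, iteration             # both sets unchanged: converged
        active, faulty = new_active, new_faulty
    return p, q, max_iter

# Toy stand-ins with invented numbers (not results from the paper).
def toy_p(faulty):
    if 13 in faulty:
        return {"A": 0.96, "B": 0.93, "C": 0.03}
    return {"A": 0.56, "B": 0.43, "C": 0.03}

def toy_q(active):
    return {13: 0.99 if "A" in active else 0.60, 5: 0.02}

p, q, n_iter = iterate_active_faulty(toy_p, toy_q)
print(sorted(i for i, v in p.items() if v > 0.5), n_iter)
```

The iteration count returned here includes the final pass that confirms nothing has changed, so a run that settles after two updates reports three passes. Passing `P` and `Q` as arguments reflects the exploratory option mentioned in the text, where the thresholds are chosen after inspecting the first iteration.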

Analysis  of  data  assuming  no  possibility  of  faulty  observations 

Figure 2 shows the posterior probabilities $\{p_i\}$ for the data of Table 1
with $\alpha_1 = 0.2$, $\gamma = 2.5$ when we do not allow for faulty
observations ($\alpha_2 = 0$). The probabilities suggest, as did the normal
plot, that although main effects 2 and 3 are largest in absolute magnitude the
evidence for these effects being active is rather slight.

Analysis  of  data  assuming  faulty  observations  possible 

Earlier  work  (Chen  and  Box  (1979))  suggested  that  this  kind  of  analysis 
for  faulty  values  is  chiefly  affected  by  the  parameter  G  =  ctjk  Vd-otj)  and 
that  it  is  fairly  insensitive  to  change.  Relying  on  this  work  we  employ  the 
values $\alpha_2 = 0.05$ and $k = 5$. In practice the analyst may use the computer
to  experiment  somewhat  with  other  values  and  thus  to  check  on  the  stability  of 
the  conclusions. 

In Figure 3, using $\alpha_2 = 0.05$ and $k = 5$, the posterior probabilities
$\{p_i\}$ and $\{q_j\}$ are plotted. The value of $q_{13}$ is very close to one,
suggesting strongly that observation $y_{13}$ is faulty. The effect on the
probabilities $\{p_i\}$ of the automatic downweighting of $y_{13}$ achieved by
this analysis is to make the posterior probabilities for main effects 2 and 3
much closer to one, and the probabilities for interactions 13 and 134 also much
larger. The conclusions are similar to those suggested by a Daniel plot with
data in which $y_{13}$ has been suitably adjusted. Revised values of the
estimated effects are given in Table 2.

Figure 2. Posterior probabilities $\{p_i\}$ that each effect is active,
assuming no faulty observations, with $\alpha_1 = 0.2$, $\gamma = 2.5$.

Figure 3. a) Posterior probabilities $\{p_i\}$ that each effect is active,
assuming $y_{13} = 59.15$ has variance $k^2\sigma^2$, with $\alpha_1 = 0.2$,
$\gamma = 2.5$, $k = 5$.

b) Posterior probabilities $\{q_j\}$ that each observation is faulty
assuming main effects 2 and 3 and interactions 13 and 134 are
active, with $\gamma = 2.5$, $\alpha_2 = 0.05$, $k = 5$.

Table 2. Posterior probabilities $\{p_i\}$ and Bayesian estimates of effects
with $\alpha_1 = 0.2$, $\gamma = 2.5$, $\alpha_2 = 0.05$, $k = 5$.

          assuming no outliers      allowing for outliers
          estimated    post.        estimated    post.
column    effect       prob.        effect       prob.

   1        0.79       .029           0.25       .029
   2       -4.18       .557          -3.17       .960
   3        3.67       .432           2.68       .931
   4        1.00       .032          -0.02       .026
   5        0.90       .031          -0.13       .026
   6       -2.47       .151          -1.49       .628
   7       -0.57       .027           0.48       .043
   8       -0.79       .029           0.25       .029
   9       -1.17       .036          -0.17       .028
  10        1.48       .046           0.52       .051
  11        1.19       .036           0.19       .028
  12        0.71       .028          -0.33       .032
  13        0.40       .025           1.42       .587
  14       -1.56       .051          -0.62       .069
  15        1.50       .048           0.55       .056


Conclusion

We feel that with the increase in computational power now becoming
available, analysis of the kind we suggest here is a practical possibility.
Furthermore, experimentation with the parameters $\alpha_1$, $\alpha_2$,
$\gamma$ and $k$ can indicate to what extent the conclusions are insensitive to
reasonable changes in the probability model. Experience may show that such
analysis can usefully augment the highly successful graphical methods of Daniel.